[57] ABSTRACT

A machine translation system having an idiom processing
function is disclosed, which includes a keyboard for inputting
a word sequence of a first language; a dictionary
memory for storing therein idioms of the first language
including at least two fixed portions and a variable portion
interposed therebetween as headers and translated expressions
in a second language corresponding to the respective
headers; and a control processor for performing a registration
process for newly or additionally registering a header of
the first language and translated expressions in the second
language corresponding to the header into the dictionary
memory, a dictionary lookup process for retrieving a header
corresponding to an idiom included in the word sequence of
the first language input by means of the keyboard from the
headers stored in the dictionary memory by comparing the
word sequence of the first language with each of the headers,
and an idiom processing process for normalizing an arrangement
of fixed portions in a word sequence of the first
language which is identified with one of the headers in the
dictionary lookup process.
(1a) I have "both" A "and" B.

(1b) I have "neither" A "nor" B.

(2a) This is "so" hot "that" children cannot drink it.

(2b) This is "too" hot "to" drink it.

(2c) This is "the same" book "that" you bought.
Idiom in English        Part of speech   Translation          Part of speech
both *N1 and *N2 (1)    noun phrase      *N1 と *N2 の両方     others (2)

1: *N means a noun phrase.
2: "Others" means noninflectional nouns, adverbs and the like.
Idiom in English        Part of speech   Translation          Part of speech
neither *N1 nor *N2     noun phrase      *N1 * *N2*           others

MACHINE TRANSLATION SYSTEM HAVING 
IDIOM PROCESSING FUNCTION 

BACKGROUND OF THE INVENTION 

1. Field of the Invention

The present invention relates to machine translation sys- 
tems and, more particularly, to machine translation systems 
having an idiom processing function which are capable of 
registering, retrieving and translating idioms. 

2. Description of Related Arts 

Conventional language processing systems for practical 
use include word processors for supporting documentation 
and machine translation systems for translating a document 
from one language to another language. 

These language processing systems have a dictionary for
storing therein a multiplicity of unit items each including a
header and various kinds of information associated thereto.
Headers include not only words of a natural language such
as English or Japanese but also sequences of words (such as
phrases and correlated words) which are taken as a whole to
express a certain meaning, i.e., idioms. Some consist of
consecutive words like "high school", and others consist of
split words like "so . . . that" (split idioms).

In a translation process, the interrelation between a split
idiom of a source language and an equivalent expression of 
a target language varies case by case and, hence, it is difficult 
to handle the split idioms. Exemplary split idioms are shown 
below: 



Idiom registration for the sentence (1a)

(1a)
(1b)
(2a)
(2b)
(2c)



A conventional machine translation system introduces
representative symbols as shown in FIG. 15 when registering
such split idioms into a dictionary, so that idioms having
a variable portion consisting of a single word as well as a
word sequence can be processed for translation. Japanese
Unexamined Patent Publication HEI 6(1994)-139272 discloses
a machine translation system which registers idioms
by using the representative symbols.

For example, idioms included in the above sentences are 
registered in the following manner: 



Idiom in English        Part of speech   Translation          Part of speech
so *A that *c           adjective        *c[*o at* *A         (3)

3: Since the part of speech of the translated idiom is determined by the part
of speech of a word or word sequence *A in the translation, the part of speech
thereof is not specified.

Idiom registration for the sentence (2b) 




Idiom in English        Part of speech   Translation          Part of speech
too *a to *I            adjective        fc'JJC *a            (4)

4: Since the part of speech of the translated idiom is determined by the part
of speech of a word or word sequence *a in the translation, the part of speech
thereof is not specified.

Idiom registration for the sentence (2c) 



Idiom in English        Part of speech   Translation          Part of speech
the same *n that *C     noun             *C[*:} ©£H 1/        (5)

5: Since the part of speech of the translated idiom is determined by the part
of speech of a word or word sequence *n in the translation, the part of speech
thereof is not specified.

In this prior art, an idiom matchable with an input word
sequence is retrieved from the idioms registered as shown 
above in the dictionary, and the syntax of words or word 
sequences represented by representative symbols is ana- 
lyzed. Based on the result of the syntactical analysis, a 
translation corresponding to the idiom in the input word 
sequence is generated by using a translated expression 
registered as shown above. 

Where a variable portion separating the constituents of the
idiom is a word sequence or a phrase, i.e., where a representative
symbol (a non-terminal representative symbol as
shown in FIG. 15) representing a word sequence in an idiom
header corresponds to the variable portion, the translation
system of the prior art cuts out the variable portion to
perform a syntactical analysis only for the variable portion.

Then, if the syntactical analysis is successfully completed, 
the translation system generates a translation of the variable 
portion, and inserts the translation into a translated expres- 
sion of the idiom header. After a portion corresponding to 
the idiom is thus translated, the system translates the entire 
sentence by a recursive process. 
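The header matching described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a header is a space-separated string in which items beginning with "*" are variable portions and all other items are fixed words; the function name match_header is hypothetical and is not taken from the cited publication. The sketch only shows how variable portions are cut out of the input; the syntactical analysis and translation of each cut-out portion are not modeled.

```python
def match_header(header, tokens):
    """Return {symbol: captured words} if tokens match the header, else None.

    Items of the header that begin with "*" are treated as variable
    portions (an assumption of this sketch); all other items are fixed
    words that must appear literally in the input.
    """
    def walk(items, rest, bound):
        if not items:
            # The header is exhausted: succeed only if the input is too.
            return bound if not rest else None
        head, tail = items[0], items[1:]
        if head.startswith("*"):
            # Variable portion: try every non-empty span, shortest first.
            for end in range(1, len(rest) + 1):
                result = walk(tail, rest[end:], {**bound, head: rest[:end]})
                if result is not None:
                    return result
            return None
        # Fixed portion: the next input word must be identical.
        if rest and rest[0] == head:
            return walk(tail, rest[1:], bound)
        return None

    return walk(header.split(), list(tokens), {})


bindings = match_header("so *A that *C",
                        "so hot that children cannot drink it".split())
print(bindings)
```

Each captured span would then be handed to the syntactical analysis separately, which is the recursive step whose overhead is discussed later in this description.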

A prior-art translation system having no such special 
registration means requires special rules as shown below to 
process split idioms. 

In the sentence (1a), for example, a word sequence "both
A and B" which is an object of the verb "have" functions as
a noun phrase. Therefore, a rule is required to define the
syntax of the noun phrase as follows:

Noun phrase -> correlative word 1 + noun phrase + correlative word
2 + noun phrase

where the correlative words 1 and 2 correspond to words
"both" and "and", and the first and second noun phrases
correspond to "A" and "B", respectively, in the sentence
(1a).

Thus, the rule is designed so as to include special parts of 
speech such as the correlative words 1 and 2. In case of the 
to the word "both" corresponding to the correlative word 1,
and a translation "と" (to) is assigned to the word "and"
corresponding to the correlative word 2.

Then, a translation "A と B の両方" (A to B no ryouhou) is
assigned to this idiom.

In case of the sentence (2a), a word sequence "so hot that
children cannot drink it" functions as a complement of a
word "is". Therefore, another rule is required to define the
syntax of the adjective phrase as follows:

Adjective phrase -> correlative word 1 + adjective + correlative word
2 + clause

where the correlative words 1 and 2 correspond to words
"so" and "that", respectively.

Similarly, still another rule is required to define the syntax
of the adjective phrase in the sentence (2b) as follows:

Adjective phrase -> correlative word 1 + adjective + correlative word
2 + ...

Yet another rule is required to define the syntax of the
noun phrase in the sentence (2c) as follows:

Noun phrase -> correlative word 1 + noun + correlative word 2 + clause
(no object)

To merely cover the aforesaid five idiomatic expressions,
four rules are required.

The correlative word 1 should be compatible with the
correlative word 2 in each of the idiomatic expressions, and
the combination thereof is predetermined. In the sentences
(1a) and (1b), for example, the word sequences "both A and
B" and "neither A nor B" are correct, but a word sequence
"both A nor B" is incorrect. That is, the words "both" and
"neither" are compatible with "and" and "nor", respectively.
Therefore, additional information concerning the compatible
combination of the correlative words 1 and 2 should be
described in the dictionary.

In the conventional translation system, idiom headers
registered in the dictionary can include a variable portion in
the form of phrase and, therefore, the generality of the idiom
headers can significantly be improved. However, the matching
of the variable portion is performed only after a phrase
corresponding to the variable portion is extracted from a
source text and subjected to a syntactical analysis process,
transformation process and generation process.

If the syntactical analysis of the variable portion fails, the
syntactical analysis process for the variable portion is
recursively performed many times and, therefore, the overhead
for this process is increased. More specifically, as idiom
headers having variable phrases are increasingly registered
in the dictionary, more time is required for the matching of
an idiom header. Therefore, the throughput of the entire
translation process may be reduced by the registration of an
idiom header including a variable portion of an unexpected
syntax.

Further, as previously stated, the provision of numerous
special exceptional rules may be disadvantageous in terms
of the throughput. Even if all the rules such as the aforesaid
grammatical rules possibly required for the translation
process are prepared and the information concerning the
compatible combination of correlative words is registered for
each of the idiom headers in the dictionary, the following
problems will arise.

words "is" and "designed" constitute a passive-voice verbal
phrase and words "so" and "that" constitute a subordinate
conjunction, the syntactical relation of the former crosses the
syntactical relation of the latter. Therefore, a grammatical
rule for processing such a syntax cannot be described.

Specifically, assuming that there is a sentence
"abcd" consisting of four words, in which "a"
relates to "b" and "c" to "d", the following
grammatical rules can be applied to this sentence.

X->ab ("a" relates to "b", and "X" is generated by
arranging "a" and "b" in this order.)
Y->cd ("c" relates to "d", and "Y" is generated by
arranging "c" and "d" in this order.)
Z->XY ("X" relates to "Y", and "Z" is generated by
arranging "X" and "Y" in this order.)
Z=[X, Y]=[(a, b), (c, d)]=abcd

However, if "a" relates to "c" and "b" relates to "d", i.e.,
the syntactical relation of the former crosses the syntactical
relation of the latter, the syntax of this sentence cannot be
described in accordance with the aforesaid grammatical
rules.

The conventional machine translation system requires
numerous grammatical rules to describe the syntactical
relation between separated words in a split idiom. Even if
exceptional rules are prepared to process split idioms, the
syntax such as of the sentence (3a) cannot be properly
analyzed for correct translation. This is because sentences of
a source language may include various kinds of split idioms
and increased number of exceptional processes are required
for the parsing of these split idioms.

SUMMARY OF THE INVENTION

In accordance with the present invention, there is provided
a machine translation system having an idiom processing
function which comprises: an input means for inputting
a word sequence of a first language; a dictionary means
for storing therein idioms of the first language including at
least two fixed portions and a variable portion interposed
therebetween as headers and translated expressions in a
second language corresponding to the respective headers; a
registration means for newly or additionally registering a
header of the first language and translated expressions in the
second language corresponding to the header into the
dictionary means; a dictionary lookup means for retrieving a
header corresponding to an idiom included in the word
sequence of the first language input by the input means from
the headers stored in the dictionary means by comparing the
word sequence of the first language with each of the headers;
and an idiom processing means for normalizing an arrangement
of fixed portions in a word sequence of the first
language which is identified with one of the headers by the
dictionary lookup means.

According to the present invention, the arrangement of
the fixed portions of the idiom included in the word
sequence which is identified with one of the headers in the
dictionary is normalized. Therefore, a translation of a split
idiom can be generated by using ordinary grammatical rules
without defining any special grammatical rule for the
translation process. Further, even if one syntactical relation
between words constituting the idiom crosses another
syntactical relation, a correct translation can be generated.

Furthermore, the dictionary means is adapted to store
therein the idioms of the first language in such a manner that
a principal fixed portion of each of the idioms can be
distinguished from a supplementary fixed portion thereof,
and the dictionary lookup means is adapted to generate
part-of-speech information for each word and syntactical
information including type information assigned to the fixed
portions in the word sequence of the idiom identified with
one of the headers for distinguishing the principal fixed
portion from the supplementary fixed portion and pointer
information indicative of interrelation between words in the
input word sequence. Therefore, there is no need to define
any special grammatical rule, and a split idiom can be
translated in the same manner as an unsplit idiom which can
be processed by using ordinary grammatical rules.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a block diagram illustrating a fundamental 
structure of a machine translation system in accordance with 
the present invention; 

FIG. 2 is a block diagram illustrating a structure of a 
machine translation system in accordance with one embodi- 
ment of the present invention; 

FIG. 3 is a block diagram for explaining each function;

FIG. 4 is a diagram for explaining an exemplary regis- 
tration of idioms; 

FIG. 5 is a diagram for explaining another exemplary 
registration of idioms; 

FIG. 6 is a diagram for explaining an exemplary repre- 
sentative symbol table; 

FIG. 7 is a flow chart illustrating a dictionary lookup 
process and morpheme analyzing process in accordance 
with the present invention; 

FIG. 8 is a flow chart illustrating an idiom processing 
process in accordance with the present invention; 

FIG. 9 is a schematic diagram for explaining information 
retained in a dictionary lookup result buffer A; 

FIG. 10 is a schematic diagram for explaining information 
retained in a dictionary lookup result buffer A; 

FIG. 11 is a schematic diagram for explaining information 
retained in the dictionary lookup result buffer A after the 
idiom processing process; 

FIG. 12 is a schematic diagram for explaining information 
retained in the dictionary lookup result buffer A after the 
idiom processing process; 

FIG. 13 is a schematic diagram for explaining information 
retained in the dictionary lookup result buffer A; 

FIG. 14 is a schematic diagram for explaining information 
retained in the dictionary lookup result buffer A after the 
idiom processing process; 

FIG. 15 is a diagram for explaining prior-art representa- 
tive symbols; and 

FIG. 16 is a diagram for explaining a case where one 
syntactical relation crosses the other syntactical relation. 

DESCRIPTION OF THE PREFERRED 
EMBODIMENTS 

In view of the foregoing, it is an object of the present 
invention to provide a machine translation system which has 
a function of processing a split idiom including a word 
sequence (or phrase) as a variable portion by expressing the 
split idiom by an idiom header having fixed portions and 
variable portion and normalizing a word arrangement of the 
fixed portions. Such machine translation system does not 
require special grammatical rules, nor a recursive process 
such as of the prior art to process candidate phrases for the
variable portions of the split idiom in an input source text.
FIG. 1 is a block diagram illustrating a fundamental
structure of a machine translation system in accordance with
the present invention.

As shown, a machine translation system having an idiom
processing function includes: an input means 1 for inputting
a word sequence of a first language; a dictionary means 3 for
storing therein idioms of the first language including at least
two fixed portions and a variable portion interposed
therebetween as headers and translated expressions in a second
language corresponding to the respective headers; a
registration means 2 for newly or additionally registering a
header of the first language and translated expressions in the
second language corresponding to the header into the
dictionary means; a dictionary lookup means 4 for retrieving a
header corresponding to an idiom included in the word
sequence of the first language input by the input means from
the headers stored in the dictionary means 3 by comparing
the word sequence of the first language with each of the
headers; and an idiom processing means 5 for normalizing
an arrangement of fixed portions in a word sequence of the
first language which is identified with one of the headers by
the dictionary lookup means 4.

Preferably, the dictionary means 3 is adapted to store
therein the idioms of the first language in such a manner that
a principal fixed portion of each of the idioms can be
distinguished from a supplementary fixed portion thereof,
and the dictionary lookup means 4 is adapted to generate
part-of-speech information for each word and syntactical
information including type information assigned to the fixed
portions in the word sequence of the idiom identified with
one of the headers for distinguishing the principal fixed
portion from the supplementary fixed portion and pointer
information indicative of interrelation between words in the
input word sequence.
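The type information and pointer information generated by the dictionary lookup means 4 can be pictured with a minimal sketch. It borrows the header notation used later in this description, in which items beginning with "*" are variable portions and a trailing "_*" mark identifies the principal fixed portion. The names Entry, fixed_portions and lookup, the zero-based word positions and the fixed part-of-speech candidate 0 in the pointer are assumptions of the sketch, not details confirmed by the embodiment.

```python
from dataclasses import dataclass
from typing import Optional

PRINCIPAL_MARK = "_*"  # mark attached to the principal fixed portion


@dataclass
class Entry:
    position: int                     # word position in the input (zero-based here)
    word: str
    fixed_type: Optional[str] = None  # "P" principal, "S" supplementary
    pointer: Optional[str] = None     # "x/y": position x, candidate y


def fixed_portions(header):
    """Return [(word, is_principal)] for the fixed portions of a header."""
    result = []
    for item in header.split():
        if item.startswith("*"):
            continue  # variable portion, nothing to record
        if item.endswith(PRINCIPAL_MARK):
            result.append((item[:-len(PRINCIPAL_MARK)], True))
        else:
            result.append((item, False))
    return result


def lookup(words, header):
    """Flag the fixed portions of a split idiom found in the input words."""
    entries = [Entry(i, w) for i, w in enumerate(words)]
    fixed = fixed_portions(header)
    positions, start = [], 0
    for word, _ in fixed:
        try:
            pos = words.index(word, start)  # must follow the previous one
        except ValueError:
            return entries  # the header is not applicable to this input
        positions.append(pos)
        start = pos + 1
    for k, (pos, (_, principal)) in enumerate(zip(positions, fixed)):
        entries[pos].fixed_type = "P" if principal else "S"
        if k + 1 < len(positions):
            # Pointer to the next fixed portion; candidate 0 is assumed.
            entries[pos].pointer = f"{positions[k + 1]}/0"
    return entries


for entry in lookup("I have both A and B .".split(), "both *N1 and_* *N2"):
    print(entry)
```

With this input, "both" is flagged as a supplementary fixed portion pointing at the position of "and", and "and" is flagged as the principal fixed portion, so later stages can tell the two apart without a special grammatical rule.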

Preferably, the idiom processing means 5 is adapted to
prepare information for presuming that a word corresponding
to the supplementary fixed portion selected from words
identified with the fixed portions of the header of the idiom
has been moved to a position of a word corresponding to the
principal fixed portion in the input word sequence and add
the information to a word other than the word corresponding
to the supplementary fixed portion to transform the
syntactical information when normalizing the arrangement of
fixed portions of the idiom.

Alternatively, the idiom processing means 5 may be
adapted to prepare information for presuming that a word
corresponding to the supplementary fixed portion selected
from words identified with the fixed portions of the header
of the idiom has been deleted from the input word sequence,
and add the information to a word other than the word
corresponding to the supplementary fixed portion to
transform the syntactical information when normalizing the
arrangement of the fixed portions of the idiom.
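The two alternatives above, treating the supplementary fixed portion as moved to the position of the principal fixed portion or as deleted, can be sketched on a plain word list. The function name normalize and its mode argument are hypothetical. The sketch rearranges the words themselves for readability, whereas the idiom processing means 5 is described as attaching the corresponding information to another word rather than rewriting the sentence.

```python
def normalize(words, supplementary, principal, mode="move"):
    """Return a new word list with the fixed portions brought together.

    mode="move":   the supplementary fixed portion is placed just before
                   the principal fixed portion.
    mode="delete": the supplementary fixed portion is dropped.
    """
    s = words.index(supplementary)
    p = words.index(principal, s + 1)  # principal must follow supplementary
    rest = words[:s] + words[s + 1:]   # input without the supplementary word
    if mode == "delete":
        return rest
    if mode == "move":
        p -= 1  # principal shifted left by one after the removal
        return rest[:p] + [supplementary] + rest[p:]
    raise ValueError(f"unknown mode: {mode}")


sentence = "This is so hot that children cannot drink it".split()
print(" ".join(normalize(sentence, "so", "that", mode="move")))
print(" ".join(normalize(sentence, "so", "that", mode="delete")))
```

In both results the fixed portions no longer enclose the variable portion "hot", so the remaining words can be analyzed with ordinary grammatical rules.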

Preferably, a translation generating means generates a
translation of a second language from the syntactical
information transformed by the idiom processing means 5.

The translation is output by an output means which
includes a display, printer and the like.

Exemplary input devices used as the input means 1 shown
in FIG. 1 include a keyboard, pointing devices and the like,
but are not limited thereto. Exemplary memory devices used as
the dictionary means 3 include an ROM, RAM, flexible
disk, hard disk and the like, but are not limited thereto.

The dictionary means 3 is of the type which is commonly
used for translation and serves to store therein headers each 
including a word or a word sequence of the first language 
Other than such information, the dictionary means 3 may
include part-of-speech information and information required
for retrieval of a header. Preferably, the dictionary means 3
is designed so that a user can additionally register a header
and a translated expression corresponding thereto, or update
the dictionary registration.

Typically employed as the registration means 2, dictionary
lookup means 4 and idiom processing means 5 is a CPU
or microprocessor including peripheral circuits such as an
ROM, RAM and I/O interface. Programs for controlling the
operations of the machine translation system are preferably
incorporated in the ROM or RAM.

Practically, the machine translation function realized by
these means above mentioned in this invention is incorporated
in word processors, personal computers or exclusive
machines for translation.

Idioms herein defined include idiomatic expressions,
idiomatic phrases, set phrases, phrases consisting of correlative
words and the like which are commonly used in daily life.

Idioms of the first language including at least two fixed
portions and a variable portion interposed therebetween are
herein called "split idioms". A split idiom, for example, has
a first fixed portion, first variable portion, second fixed
portion and second variable portion arranged in this order,
and the first and second fixed portions thereof are thus
separated.

The fixed portion is a predetermined word or word
sequence (fixed word or word sequence) in an idiom, and the
variable portion is a word or word sequence (variable word
or word sequence) which varies depending on an input
sentence.

The principal fixed portion is a word or word sequence
which plays the most important part in the plural fixed
portions of the idiom when the idiom is translated, and the
supplementary fixed portion is a portion other than the
principal fixed portion.

In a split-idiomatic expression "so hot that children cannot
drink it", for example, words "so" and "that" are fixed
portions, and a word "hot" and a clause "children cannot
drink it" are variable portions. Of the fixed portions, the
word "so" is a supplementary fixed portion, and the word
"that" is a principal fixed portion.

Normalizing an arrangement of fixed portions means that
a word or word sequence of one fixed portion in a position
apart from another fixed portion is deleted or moved within
a split idiom, or a variable portion is moved so that the idiom
can be regarded as having a single fixed portion, and then the
resultant word arrangement is stored.

For example, the arrangement of the words in the input
word sequence is normalized by performing at least one of
the following processes:

(1) Any of the fixed portions included in the input word
sequence is deleted;

(2) Plural words or word sequences corresponding to the
variable portions in the input word sequence are moved
to different positions;

(3) The first fixed portion included in the input word
sequence is moved just before the second fixed portion;

(4) The second fixed portion included in the input word
sequence is moved just after the first fixed portion.

There will hereinafter be detailed the present invention by
way of an embodiment thereof with reference to the attached
drawings. It should be noted that the embodiment is not
limitative of the invention.

FIG. 2 is a block diagram illustrating a structure of a
machine translation system in accordance with the embodiment
of the present invention.

The machine translation system includes a main CPU
(central processing unit) 21, a bus 27 through which the
main CPU 21 and other components are connected with each
other, a main memory (including memory buffers) 22 connected
to the bus 27, a display device 23 (CRT (cathode ray
tube), LCD (liquid crystal display) or the like) connected to
the bus 27, a keyboard 24, a translation module 25 connected
to the bus 27, and an external memory 26 connected to the
translation module 25 for storing therein a translation
dictionary, syntactical analysis grammatical rule, transformation
grammatical rule and generation grammatical rule.

The translation module 25 serves to translate an input
sentence from a source language to a target language in
accordance with a predetermined procedure.

FIG. 3 is a block diagram illustrating the structure of the
translation module 25 of the machine translation system in
accordance with the present invention.

A source text inputting section 31 serves to accept an
input source text to be translated, and corresponds to the
keyboard 24 shown in FIG. 2.

A dictionary registration section 32 serves to add, modify
and delete dictionary information such as dictionary headers
and translated words or expressions stored in a memory
section 33, and is embodied by the main CPU 21 shown in
FIG. 2.

The memory section 33 includes a translation dictionary
33a, buffer memory 33b, syntactical analysis grammatical
rule 33c, transformation grammatical rule 33d and generation
grammatical rule 33e, and corresponds to the main
memory 22 and external memory 26 shown in FIG. 2.

The main memory 22 serves to store therein various kinds
of information utilized in the machine translation system,
and an RAM is typically employed as the main memory 22.

The external memory 26 serves to store therein the
translation dictionary 33a, syntactical analysis grammatical
rule 33c, transformation grammatical rule 33d and generation
grammatical rule 33e, and a hard disk or flexible disk is
employed as the external memory 26.

A translation outputting section 38 serves to output a
translation generated by the translation module, and corresponds
to the display device 23 shown in FIG. 2, printer or
the like.

The translation module 25 shown in FIG. 2 includes a
dictionary lookup/morpheme analyzing section 34, syntax
analyzing section 35, transformation section 36 and generation
section 37.

The dictionary lookup/morpheme analyzing section 34
serves to divide the source text input from the source text
inputting section 31 into a morpheme sequence (word sequence),
obtain morphemic information including grammatical information
such as a part of speech of each word and a translated
expression corresponding to the word, and analyze information
such as of a tense, person and number.

The syntax analyzing section 35 serves to determine a
syntactical analysis tree indicative of syntactical relations
between respective words in the source text in accordance
with the obtained morphemic information and the grammatical
rules. The transformation section 36 serves to transform
the syntactical analysis tree for the input source text into a
syntactical analysis tree for a translated text. The generation
section 37 serves to build a sentence structure for the target
language in accordance with the generation grammatical rule
for the target language, then add appropriate particles and
auxiliary verb to generate a correct translation and output the
generated translation.
The dictionary lookup/morpheme analyzing section 34
includes a dictionary lookup section 34a for searching the
translation dictionary 33a in the memory section 33, a
morpheme analyzing section 34b for performing a morpheme
analyzing process for the source text based on
information obtained from the dictionary, and an idiom
processing section 34c for processing a split idiom.

The idiom processing section 34c includes an idiom
searching section 34c-1 and a dictionary lookup result buffer
modifying section 34c-2 for deleting or moving a word
corresponding to a supplementary fixed portion selected
from words of the fixed portions of the idiom.

Though not shown in the block diagram, the translation
module 25 of the aforesaid structure includes a translation
CPU for performing operations in the respective sections
within the translation module, i.e., for performing a so-called
translation process, a program memory for storing therein a
program for translation process, and a buffer for storing
therein information such as part-of-speech information and
translated words necessary for the execution of the translation
process.

Preferably, an ROM is employed as the program memory,
and an RAM is employed as the buffer. The functions of the
respective sections in the translation module 25 are performed
by the translation CPU.

A so-called MPU (multiprocessing unit) including an
RAM, ROM, input/output interface and timer can otherwise
be used as the main CPU 21 and translation CPU.

FIGS. 4 and 5 are diagrams for explaining an exemplary
idiom registration process in which split idioms are registered
in the translation dictionary by the dictionary registration
section 32 in accordance with this embodiment.

FIG. 6 is a diagram for explaining an exemplary representative
symbol table in accordance with this embodiment.
For example, a symbol "*n" means a character string of a
single noun word, and a symbol "*N" means a noun phrase
including one or more words.

Word sequences [1a], [1b], [2a], [2b], [2c] and [3a] of
split idioms shown in FIGS. 4 and 5 are registered for
processing the split idioms included in the aforedescribed
sentences (1a), (1b), (2a), (2b), (2c) and (3a), respectively.

"English" representing a registered English header
includes, for example, an English word sequence [1a] "both
*N1 and_* *N2", in which the words "both" and "and" are
fixed portions, and symbols "*N1" and "*N2" are variable
portions in the header and are defined as nouns or noun
phrases.

When a plurality of variable portions are represented by
the same representative symbol, the representative symbol
accompanies numerals 1, 2, ... sequentially assigned to the
respective variable portions from the first variable portion to
indicate the interrelation between each of the variable portions
in the idiom and that in a translated expression. A mark "_*"
attached at the end of the word "and" indicates that the
word "and" is a principal fixed portion.

charts shown in FIGS. 7 and 8. It is herein assumed that the
sentence (1a) "I have both A and B" is input as a source
sentence from the source text inputting section 31 shown in
FIG. 3. It is further assumed that split idioms shown in FIGS.
4 and 5 are registered into the translation dictionary 33a.

In step S2, the translation dictionary 33a shown in FIG. 3
is searched for a character string in the source sentence (1a)
input in step S1. If the part of speech of a retrieved header
is not "CD" (split idiom), the process goes into step S5 from
judgment step S3, and dictionary information of the
retrieved header is retained in the dictionary lookup result
buffer A 40 reserved in the buffer memory 33b of the
memory section 33 shown in FIG. 3.

In judgment step S6, it is judged if the input sentence has
been processed up to the end thereof (or up to the end portion
represented by "."). If NO, the process returns to step S2,
and steps S2 through S6 are repeated. When the translation
dictionary 33a is searched for a character string "both" in the
input sentence, a registered header [1a] "both *N1 and_*
*N2" shown in FIG. 4 is retrieved.

Since the part of speech of the header is "CD" (split
idiom), the process goes into step S4, and it is checked if the
other fixed portion defined in the English header in the
dictionary, i.e., a character string "and_*" is present in a
position behind a currently processed word position in the
source sentence, to check the applicability of the retrieved
split idiom. In the character strings defined in the header,
character strings that begin with "*" are variable portions,
and character strings other than the variable portions are
fixed portions. A mark "_*" in the character string "and_*"
is an identifier indicative of a principal fixed portion. In this
case, it is judged if a character string excluding the identifier,
i.e., "and" is present in the source sentence.

In judgment step S4, it is judged that "and" defined as the
fixed portion is present in the source sentence, and the
process goes into step S5. This dictionary information is
retained in the dictionary lookup result buffer A 40.

In step S5, a pointer indicative of the position of the next
fixed portion is set to "4/0", since the part of speech of the
header is "CD".

Generally, a pointer x/y indicates a word position x and
part-of-speech candidate y. The character string "both" in
the source sentence does not serve as a principal fixed
portion in the header "both *N1 and_* *N2" and, therefore,
a flag indicative of the type of the fixed portion (principal
fixed portion "P" or supplementary fixed portion "S") is set
to "S". If a character string serves as a principal fixed
portion, the flag is set to "P".

The steps S2 through S6 are repeated until the input
sentence is processed up to the end portion. Then, the
dictionary information is generated in the dictionary lookup
result buffer A as shown in FIG. 9.

The process sequence from step S1 to step S6 is performed
by the dictionary lookup section 34a.

In FIG. 9, numerals shown in "Word position" indicate a

a fixed portion that is not subject to deletion. numerical order of respective words in the input source 

"English part of speech" indicates an English part of sentence retained in the buffer- 
speech assigned to each of the headers. "English part of Numerals 0, 1, 2 and 3 shown in Xandidate" indicate 
speech 2" indicates an English part of speech inserted in a candidate numbers assigned to probable parts of speech of 
position of a principal fixed portion after a supplementary 60 each of the character strings in the input source sentence 
fixed portion is deleted. "Translation attribution" indicates which are retrieved in the dictionary lookup process. For 
an attribution to be assigned to a translated expression when example, a word "have" has two part-of-speech candidates, 
the header is employed. In case of the sentence (lb), for Information of "Word mimber", "Part of speech", **iype" 
example, the attribution serves as an instruction for the and "Pointer" is retained in the buffer for every' part-of- 
generation section to generate a negative sentence. 65 speech candidate of respective words. 

Next, there will be described the translation process in A numeral shown in "Word number" indicates the number 

accordance with the present invention with reference to flow of words included in a word sequence or a word registered 
counted as one word.

Since the source sentence (1a) is divided word by word, all the word numbers are one as shown in FIG. 9. If a word sequence resulting from the division of the source sentence consists of a plurality of words, the word number is represented by the number of spaces between the words plus one.

For example, the word number of a word sequence "high speed machine" is "3". The word number indicates a word [...] sequence "high speed machine" having a word number "3" is regarded as one word and followed by the subsequent word.

Codes shown in "Part of speech" indicate possible parts of speech assigned to each of the words in the source sentence as shown in FIG. 4. "Type" indicates whether each of the words serves as a principal fixed portion or as a supplementary fixed portion in the header, as previously mentioned. If the word serves as the principal fixed portion, the type is set to "P", and if not, the type is set to "S".

Alternatively, the type flag may be set to "1" for the type "P", and to "0" for the type "S".

"Pointer" indicates the position of the next fixed portion as stated above. In FIG. 9, a pointer "4/0" indicates that the next fixed portion is a part-of-speech candidate No. 0 of the word "and" in a word position No. 4, i.e., the word "and" which has "Word number"=1 and "Speech part"=CC.

When the sentence (2a) "This is so hot that children cannot drink it" is input in step S1, the sentence (2a) is processed in substantially the same manner as described above by the dictionary lookup section 34a, and a dictionary lookup result buffer A-2 as shown in FIG. 10 is generated.

More specifically, in step S2, the translation dictionary 33a shown in FIG. 3 is searched for a character string in the sentence (2a) input in step S1. If the part of speech of a retrieved header is not "CD" (split idiom), the process goes into step S5 from judgment step S3, and dictionary information is stored in the dictionary lookup result buffer A 40 shown in FIG. 3.

The steps S2 through S6 are repeated until the input sentence (2a) is processed up to the end thereof (or up to the end portion represented by "."). When the translation dictionary 33a is searched for a character string "so" in the input sentence (2a) in step S2, a registered header [2a] "so *A that_* *C" shown in FIG. 4 is retrieved. Since the part of speech of the header is "CD" (split idiom), the process goes into step S4, and it is checked if the other fixed portion defined in the English header in the dictionary, i.e., a word "that" is present in a position behind a currently processed word position in the source sentence, to check the applicability of the retrieved split idiom.

In judgment step S4, it is judged that the word "that" defined as the fixed portion is present in the source sentence, and the process goes into step S5. Then, this dictionary information is retained in the dictionary lookup result buffer A 40.

In step S5, a pointer indicative of the position of the next fixed portion is set to "4/0", since the part of speech of the header is "CD". The character string "so" in the source sentence does not serve as a principal fixed portion in the header "so *A that_* *C" and, therefore, the flag indicative of the type of the fixed portion is set to "S".

The steps S2 through S6 are repeated until the input sentence is processed up to the end portion ".". Then, the dictionary information is generated in the dictionary lookup result buffer A-2 shown in FIG. 10.

In the sentence (3a) "This is so designed that everyone can operate it easily." shown in FIG. 16, words "is" and [...] words "so" and "that" constitute a subordinate conjunction. Since the syntactical relation of the former crosses the syntactical relation of the latter, the prior-art system cannot correctly translate the sentence (3a), as previously mentioned. In accordance with the present invention, when the sentence (3a) is input, the translation dictionary 33a is searched for a registered header [3a] shown in FIG. 5 in step S2 shown in FIG. 7, and a dictionary lookup result buffer A-3 shown in FIG. 13 is generated.

The part of speech of the word "so" in the word position No. 2 is "CD" (split idiom), and the type thereof is "S" indicative of a supplementary fixed portion. The pointer thereof is "4/0", which indicates that the word "so" relates to a part-of-speech candidate No. 0 of the word "that" in a word position No. 4.

After this dictionary lookup process, a morphemic analysis is carried out in a conventional way to determine grammatical attribution such as number, person and tense of each word in step S7 shown in FIG. 7.

Then, the idiom processing section 34c performs a process sequence from step S8 to step S15 shown in FIG. 8. The information retained in the dictionary lookup result buffer shown in FIG. 9 or 10 is input to the idiom processing section 34c.

In step S8, a word position counter is reset to "0". Then, it is checked if a word located in a currently pointed word position has a part-of-speech candidate of "CD". If NO, the process goes into step S14. Then, the word position counter is incremented, and the next word is checked.

Referring again to the first case where the sentence (1a) is processed, when the word position counter is incremented to "2" (which indicates a word position No. 2) in the dictionary lookup result buffer A-1, the process goes into step S10, because the word "both" in the word position No. 2 has a part-of-speech candidate having "CD". In step S10, new part-of-speech candidates are generated for the word "have" in a word position just before the currently pointed word "both" by copying existing part-of-speech candidates of the word "have", and then the word numbers of the new part-of-speech candidates are rewritten to "2" (the word number of the word "have" (i.e., 1) plus the word number of the word "both" (i.e., 1)).

FIG. 11 shows the information included in the dictionary lookup result buffer A-1 after the generation of the new part-of-speech candidates for the word "have". As shown, information of part-of-speech candidates No. 0 and No. 1 of the word "have" are copied to part-of-speech candidates No. 2 and No. 3, respectively, and the word numbers of the part-of-speech candidates No. 0 and No. 1 are rewritten to "2".

That is, a word sequence "have both" consisting of the two words "have" and "both" is considered to be a single word group and looks as if the word "both" had been deleted.

Thus, the part-of-speech candidate No. 1 of the word "have" in a word position No. 1 which has a part-of-speech of "VB" directly relates to a word "A", skipping the word "both". There are generated two part-of-speech candidates for excluding the word "both" (part-of-speech candidates No. 0 and No. 1 in the word position No. 1) and two part-of-speech candidates for including the word "both" (part-of-speech candidates No. 2 and No. 3 in the word position No. 1), i.e., the word "have" has four part-of-speech candidates in total.

In step S11, an English part of speech 2 "CC" of the header "both *N1 and_* *N2" is inserted as a part of speech
of the part-of-speech candidate No. 0 of the word "and" in the word position No. 4, which is indicated by a pointer "4/0" (shown in FIG. 9) of a part-of-speech candidate No. 0 (having a part of speech "CD") of the word "both" in the currently pointed word position No. 2.

FIG. 11 shows a state of the dictionary lookup result buffer A-1 after "CC" is inserted. In this case, however, the part of speech of the word "and" is initially "CC" as shown in FIG. 9 and, hence, there is virtually no change in the part of speech thereof.

In step S12, the part-of-speech candidate No. 0 of the word "both" in the currently pointed word position (shown in FIG. 9) is deleted, since a flag thereof is set to "S" indicative of a supplementary fixed portion. Then, the rest of the part-of-speech candidates are moved to the left to fill the vacant position.

More specifically, the candidate of the word "both" having a part-of-speech "CD" shown in FIG. 9 is deleted, and the rest of the part-of-speech candidates are moved to the left to fill the vacant position as shown in FIG. 11.

However, the part-of-speech candidates of the words added into the word positions No. 1 and No. 4 are not necessarily compatible with all the part-of-speech candidates in the other word positions. The part-of-speech candidates No. 0 and No. 1 of the word "have" should be compatible with the part-of-speech candidate No. 0 of the word "and", while they are incompatible with the part-of-speech candidate No. 1 of the word "and".

Therefore, a pointer indicative of an interrelation between the words (either compatible interrelation or incompatible interrelation) is set for each of the part-of-speech candidates of the words "have" and "and" in the word positions No. 1 and No. 4. More specifically, the pointers of the part-of-speech candidates No. 0 and No. 1 of the word "have" are set to "4/0", and the pointer of the part-of-speech candidate No. 0 of the word "and" is set to "1/0" and "1/1".

The pointer indicates a compatible interrelation, and is represented by x/y, which means the part-of-speech candidate of a word is compatible with a part-of-speech candidate y of another word in a word position "x". The state of the dictionary lookup result buffer A-1 after the pointer is set is shown in FIG. 11.

As shown, the part-of-speech candidate No. 0 of the word "and" having a part of speech "CC" is compatible with the part-of-speech candidates No. 0 and No. 1 of the word "have", while it is incompatible with the part-of-speech candidates No. 2 and No. 3 of the word "have".

In step S14, the word position counter is incremented, and it is checked if "CD" exists in the rest of the source text. When the source text is processed up to the end thereof, steps S16 through S19 are performed, and the idiom processing process ends.

As described above, the steps S8 through S15 are performed by the idiom processing section 34c. More specifically, the steps S8 and S9 are performed by the idiom searching section 34c-1 and the steps S10 through S14 are performed by the dictionary buffer modifying section 34c-2.

Referring again to the second case where the sentence (2a) "This is so hot that children cannot drink it" is input, when the dictionary lookup process is finished in substantially the same manner as the first case, the morphemic analysis is performed in step S7 to determine grammatical attribution such as number, person and tense of each of the words.

Then, the aforesaid steps S8 through S15 are performed by the idiom processing section 34c. When the word position counter is incremented to "2" (which indicates a word position No. 2) in the dictionary lookup result buffer A-2, a new part-of-speech candidate is generated for a word "is" located in a word position just before the currently pointed word "so" by copying an existing part-of-speech candidate of the word "is", and then the word number of the new part-of-speech candidate is rewritten to "2" (the word number of the word "is" (i.e., 1) plus the word number of the word "so" (i.e., 1)).

Thus, a part-of-speech candidate No. 1 of the word "is" in the word position No. 1 which has a part of speech "BE" directly relates to a word "hot", skipping the word "so". There are generated a candidate for excluding the word "so" (part-of-speech candidate No. 0 in the word position No. 1) and a candidate for including the word "so" (part-of-speech candidate No. 1 in the word position No. 1).

In step S11, a second part of speech in English "AC" of the header "so *A that_* *C" is inserted as a part-of-speech candidate No. 0 of the word "that" in a word position No. 4, which is indicated by a pointer "4/0" (shown in FIG. 10) of a part-of-speech candidate No. 0 (having a part of speech "CD") of the word "so" in a currently pointed word position No. 2. FIG. 12 shows a current state of the dictionary lookup result buffer A-2 after "AC" is inserted.

In step S12, a part-of-speech candidate No. 0 of the word "so" in a currently pointed word position (shown in FIG. 10) is deleted as shown in FIG. 12, since a flag is set to "S" indicative of a supplementary fixed portion. Then, the rest of the part-of-speech candidates are moved to the left to fill the vacant position.

However, the part-of-speech candidates of the words added into the word positions No. 1 and No. 4 are not necessarily compatible with all the part-of-speech candidates in the other word positions. The part-of-speech candidate No. 0 of the word "is" should be compatible with the part-of-speech candidate No. 0 of the word "that", while it is incompatible with the part-of-speech candidates No. 1, No. 2 and No. 3 of the word "that".

Therefore, in step S13, a pointer indicative of an interrelation between the words (either compatible interrelation or incompatible interrelation) is set for each of the part-of-speech candidates of the words "is" and "that" in the word positions No. 1 and No. 4.

In FIG. 12 illustrating the state of the dictionary lookup result buffer A-2, the part-of-speech candidate No. 0 of the word "that" having a part of speech "AC" is compatible with the part-of-speech candidate No. 0, while it is incompatible with the part-of-speech candidate No. 1 of the word "is".

In step S14, the word position counter is incremented, and it is checked if "CD" exists in the rest of the source text. Then, the steps S9 through S15 are repeated until the source text is processed up to the end thereof.

Referring again to the third case where the sentence (3a) "This is so designed that everyone can operate it easily", the dictionary lookup result buffer A-3 shown in FIG. 13 is modified as shown in FIG. 14.

As shown in FIG. 14, a part-of-speech candidate is inserted as a part-of-speech candidate No. 0 of a word "that" in a word position No. 4. More specifically, the inserted part-of-speech candidate No. 0 of the word "that" has a word number "1", a part of speech "AC", a type "P" indicative of a principal fixed portion, and a pointer "1/0" indicating that the part-of-speech candidate No. 0 of the word "that" relates to the part-of-speech candidate No. 0 of the word "is" in the word position No. 1.

After the aforesaid process is completed, in step S16, a syntactical analysis is carried out by the syntax analyzing section 35. More specifically, the syntax of the source sentence is determined with reference to the syntactical analysis grammatical rule 33c stored in the memory section 33 shown in FIG. 3. The word sequence of the input source text has already been normalized by modifying the dictionary lookup result buffer, and hence has a specially arranged word sequence.

The normalization means that a fixed portion other than a 
principal fixed portion (supplementary fixed portion) is 
deleted from the split idiom included in the input word 
sequence. The arrangement of words of the input sentence is
modified so that the split idiom may be considered to be an
idiom having a single fixed portion. Therefore, no special
grammatical rule for processing a sentence having an
uncommon part-of-speech arrangement is required to process
the postmodification idiom. The syntactical analysis of
the sentence thus normalized is carried out in accordance
with a commonly used grammatical rule, and then a
syntactical analysis tree is generated.
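
The header notation on which this normalization relies can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the (text, kind) tuple layout are hypothetical, every fixed portion is assumed to be a single word, and the actual system modifies the dictionary lookup result buffer rather than a plain word list.

```python
def parse_header(header):
    """Classify each element of a header such as 'both *N1 and_* *N2'.

    Returns (text, kind) pairs, where kind is 'variable', 'principal'
    or 'supplementary'. The '_*' identifier is stripped from the text.
    """
    parts = []
    for token in header.split():
        if token.startswith("*"):
            parts.append((token, "variable"))
        elif token.endswith("_*"):
            parts.append((token[:-2], "principal"))
        else:
            parts.append((token, "supplementary"))
    return parts


def normalize_words(words, header):
    """Delete the words that match supplementary fixed portions.

    Hypothetical simplification: the first occurrence of each
    supplementary fixed portion is removed from the word list.
    """
    result = list(words)
    for text, kind in parse_header(header):
        if kind == "supplementary" and text in result:
            result.remove(text)
    return result
```

Under these assumptions, the words of sentence (1a) normalized against the header [1a] keep "and" and lose "both", so that the split idiom can be treated as an idiom having a single fixed portion.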

Alternatively, the arrangement of constituent words of the 
split idiom may be modified in accordance with another
normalization process. Similarly to the aforesaid case, the 
split idiom can be syntactically analyzed in accordance with 
the commonly used grammatical rule. 

In this syntax analyzing process, it is checked that the 
part-of-speech candidate No. 1 of the word "have" and the
part-of-speech candidate No. 1 of the word "and", for
example, do not exist in the same syntactical analysis tree,
with reference to the pointer indicative of compatible/
incompatible interrelation between words which is set in the
dictionary lookup result buffer.
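
One way to picture this check is the short Python sketch below. The rule it applies (a candidate that carries pointers into a word position may only be combined with a candidate it points at in that position) is an interpretation of the x/y pointers described for FIG. 11, and the names `selection_is_compatible` and `POINTERS_1A` are hypothetical.

```python
def selection_is_compatible(selection, pointers):
    """Return True if no chosen candidate contradicts a pointer.

    selection: {word_position: candidate_number} chosen for one tree
    pointers:  {(word_position, candidate_number): {(position, candidate), ...}}
    """
    for pos, cand in selection.items():
        # Group the pointers of this candidate by the position they target.
        allowed = {}
        for target_pos, target_cand in pointers.get((pos, cand), ()):
            allowed.setdefault(target_pos, set()).add(target_cand)
        for target_pos, cands in allowed.items():
            if target_pos in selection and selection[target_pos] not in cands:
                return False
    return True


# Pointers for sentence (1a): "have" No. 0 and No. 1 point to 4/0,
# and "and" No. 0 points to 1/0 and 1/1.
POINTERS_1A = {
    (1, 0): {(4, 0)},
    (1, 1): {(4, 0)},
    (4, 0): {(1, 0), (1, 1)},
}
```

With these pointers, a tree that combines candidate No. 1 of "have" with candidate No. 1 of "and" is rejected, while the same candidate of "have" combined with candidate No. 0 of "and" is accepted.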

After the syntactical analysis tree is generated by way of
syntactical analysis, the parts of speech of variable portions
represented by the representative symbols for idiom
registration are checked. In case of the idiom "both *N1 and_*
*N2", for example, it is checked if the principal fixed portion
"and" is interposed between noun phrases represented by a
symbol "*N". If NO (the syntactical analysis is
unsuccessful), the syntactical analysis tree employing this
idiom is abandoned, because the syntax represented by the
syntactical analysis tree cannot exist.
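
The check on the variable portions might be sketched as follows. The mapping `SYMBOL_CATEGORY`, the category labels and the one-to-one alignment between header elements and constituents are assumptions made for illustration; only the meanings of "*n" and "*N" are taken from the representative symbol table.

```python
# Assumed mapping from representative symbols to constituent categories.
SYMBOL_CATEGORY = {"*N": "noun phrase", "*n": "noun"}


def variables_match(normalized_header, constituents):
    """Check that each variable portion is filled by the expected category.

    normalized_header: header elements after normalization,
        e.g. ['*N1', 'and', '*N2']
    constituents: (text, category) pairs taken from one analysis tree,
        aligned one-to-one with the header elements.
    """
    if len(normalized_header) != len(constituents):
        return False
    for element, (text, category) in zip(normalized_header, constituents):
        if element.startswith("*"):
            # Drop the trailing numeral: '*N1' and '*N2' both mean '*N'.
            symbol = element.rstrip("0123456789")
            if SYMBOL_CATEGORY.get(symbol) != category:
                return False
        elif element != text:
            return False
    return True
```

A tree for which this function returns False corresponds to the case in which the syntactical analysis tree employing the idiom is abandoned.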

In the first case where the sentence (1a) "I have both A and
B" is input, the variable portions "A" and "B" are noun
phrases as can be seen from the dictionary lookup result
buffer A-1 shown in FIG. 11 and, therefore, the syntactical
analysis is successfully carried out.

Thereafter, the transformation process is performed by the 
transformation section 36 in step S17, and the generation 
process is performed by the generation section 37 in step 
S18. The translated expression in Japanese corresponding to 
the English idiom is obtained, then the variable portions
"*N1" and "*N2" are replaced with translated words "A" 
and "B", respectively, and a final translation for the entire 
source sentence is obtained. Then, the translation result is 
output to the CRT or printer in step S19. 
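
The replacement of the variable portions can be illustrated with a small Python sketch. The function name `fill_template` is hypothetical, and the template used in the usage note is a neutral placeholder that stands in for the registered Japanese expression, which is not reproduced here.

```python
import re


def fill_template(template, bindings):
    """Replace each variable symbol in a translated expression.

    template: translated expression containing symbols such as '*N1'
    bindings: {symbol: translated words of the matching variable portion}
    """
    # A symbol is '*' followed by letters and an optional numeral.
    return re.sub(r"\*[A-Za-z]+\d*", lambda m: bindings[m.group(0)], template)
```

For instance, with the placeholder template "X(*N1, *N2)" and the bindings {"*N1": "A", "*N2": "B"}, the call returns "X(A, B)".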

As can be understood from the foregoing, a supplementary
fixed portion is deleted from an input source sentence by
modifying information retained in the dictionary lookup
result buffer A with reference to a symbol indicative of a
principal fixed portion included in an idiom, and the input
source sentence is normalized so as to be processed in
accordance with a standard grammatical rule.
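
The buffer modification summarized above (steps S10 through S13) could be modeled roughly as in the following Python sketch. The dictionary layout, the key names 'words', 'pos', 'type' and 'pointers', and the function name `apply_split_idiom` are assumptions for illustration; the sketch also does not renumber pointers that other candidates may already hold into the modified positions.

```python
import copy


def apply_split_idiom(buffer, pos, cd_index, pos2):
    """Modify the lookup buffer for a split idiom found at word position pos.

    buffer:   {word_position: [candidate, ...]}; each candidate is a dict
              with keys 'words', 'pos', 'type' and 'pointers'.
    cd_index: index of the "CD" candidate (type "S") at position pos.
    pos2:     "English part of speech 2" taken from the header.
    """
    cd = buffer[pos][cd_index]
    target_pos, target_cand = cd["pointers"][0]
    prev = buffer[pos - 1]
    originals = len(prev)

    # S10: append copies that still treat the previous word on its own,
    # and widen the original candidates so that they absorb this word.
    prev.extend(copy.deepcopy(prev))
    for cand in prev[:originals]:
        cand["words"] += cd["words"]
        cand["pointers"] = [(target_pos, target_cand)]          # S13

    # S11: insert part of speech 2 at the principal fixed portion.
    buffer[target_pos].insert(target_cand, {
        "words": 1, "pos": pos2, "type": "P",
        "pointers": [(pos - 1, i) for i in range(originals)],    # S13
    })

    # S12: delete the supplementary candidate; the rest move to the left.
    del buffer[pos][cd_index]
    return buffer
```

Applied to a buffer for sentence (1a) with the "CD" candidate of "both" at word position No. 2 pointing to "4/0", the word "have" ends up with four candidates, of which No. 0 and No. 1 have a word number of 2 and point to "4/0".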

Therefore, a translation for the input source sentence 
including an idiom can correctly be generated. 

Further, since the syntactical analysis is carried out after
the information retained in the aforesaid dictionary lookup
result buffer A is modified in accordance with the present
invention, the throughput of translation process can be
significantly improved, compared with the recursive
translation process of the prior art in which the translation is done
over again when it is found that an applied rule is not
appropriate after the translation of a variable portion.
What is claimed is: 

1. A machine translation system having an idiom process- 
ing function, comprising: 

an input means for inputting a word sequence of a first 
language; 

a dictionary means for storing therein idioms of the first 
language including at least two fixed portions and a 
variable portion interposed therebetween as headers 
and translated expressions in a second language corre- 
sponding to the respective headers; 

a registration means for newly or additionally registering 
a header of the first language and translated expressions 
in the second language corresponding to the header into 
said dictionary means; 

a dictionary lookup means for retrieving a header corre- 
sponding to an idiom included in the word sequence of 
the first language input by said input means from the 
headers stored in said dictionary means by comparing 
the word sequence of the first language with each of the 
headers; and 

an idiom processing means for normalizing an arrange- 
ment of fixed portions in a word sequence of the first 
language which is identified with one of the headers by 
said dictionary lookup means. 

2. A machine translation system as set forth in claim 1, 
wherein said dictionary means stores therein the idioms of 

the first language in such a manner that a principal fixed 
portion of each of the idioms can be distinguished from 
a supplementary fixed portion thereof, and 
said dictionary lookup means generates part-of-speech 
information for each word and syntactical information 
including type information assigned to the fixed
portions in the word sequence of the idiom identified with
one of the headers for distinguishing the principal fixed 
portion from the supplementary fixed portion and 
pointer information indicative of interrelation between 
words in the input word sequence. 

3. A machine translation system as set forth in claim 2,
wherein said idiom processing means prepares
information for presuming that a word corresponding to the
supplementary fixed portion selected from words
identified with the fixed portions of the header of the idiom
has been moved to a position of a word corresponding
to the principal fixed portion within the input word
sequence, and adds the information to a word other than
the word corresponding to the supplementary fixed
portion to transform the syntactical information when
carrying out the normalization.

4. A machine translation system as set forth in claim 2, 
wherein said idiom processing means prepares
information for presuming that a word corresponding to the
supplementary fixed portion selected from words iden- 
tified with the fixed portions of the header of the idiom 
has been deleted from the input word sequence, and 
adds the information to a word other than the word 
corresponding to the supplementary fixed portion to 
transform the syntactical information when carrying 
out the normalization. 

5. A machine translation system as set forth in claim 2, 
wherein said idiom processing means prepares
information for presuming that words identified with the
variable portions of the header of the idiom has been moved
to different positions within the input word sequence,
and adds the information to a word other than the words
identified with the variable portions to transform the
syntactical information when carrying out the
normalization.

6. A machine translation system as set forth in claim 2, 
wherein the idioms of the first language stored in said 

dictionary means include a split idiom having at least
two fixed portions and a variable portion interposed
therebetween.

7. A machine translation system as set forth in claim 6, 
wherein, when a split idiom having first and second fixed 

portions and variable portion interposed therebetween 
is subjected to the normalization, said idiom processing
means prepares information for presuming that a word
corresponding to the supplementary fixed portion
selected from the first and second fixed portions has
been deleted, and adds the information to a word other 
than the word corresponding to the supplementary 
portion to transform the syntactic information. 

8. A machine translation system as set forth in claim 2,
further comprising: 

a translation generating means for generating a translation 
from said syntactical information normalized by said 
idiom processing means; and 

an output means for outputting said translation. 

***** 